Foundry Local, published by Microsoft Corporation, is a lightweight runtime that executes small language models entirely on local hardware, eliminating the need for cloud connectivity while keeping all data on the device. Currently at version 0.8.119.102 across 14 incremental releases, the utility brings the cloud-native capabilities of Azure AI Foundry to an on-device experience, letting developers, researchers, and privacy-conscious users run language models on laptops, workstations, or edge servers without transmitting prompts or outputs externally.

By exposing an OpenAI-compatible REST endpoint, the software lets existing applications, chat front-ends, or automation scripts switch from remote GPT endpoints to a local model with only a configuration change, while built-in hardware acceleration automatically uses available CPU, GPU, or NPU resources for the best token-generation speed. Typical use cases include offline coding assistants, confidential document analyzers, secure customer-support bots, and low-latency embedded AI in factory or healthcare PCs where regulatory constraints forbid cloud processing. Because every inference stays on the device, sensitive source code, medical records, or proprietary business data never leave the machine, simplifying compliance with GDPR, HIPAA, or corporate secrecy policies.

The package ships as a compact Windows service that downloads compatible model weights on first run, automatically selects the best execution provider, and exposes Swagger documentation for rapid prototyping. Foundry Local is available for free on get.nero.com, with downloads provided via trusted Windows package sources such as winget, always delivering the latest version and supporting batch installation of multiple applications.
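Because the service speaks the OpenAI chat-completions wire format, switching an existing script to the local model is mostly a matter of changing the base URL. The sketch below, using only the Python standard library, builds such a request; the port number and model alias are assumptions for illustration only, since the actual endpoint and installed models vary per machine (Foundry Local's CLI can report them):

```python
import json
from urllib import request

# Assumed values for illustration -- query the Foundry Local service on
# your own machine for the real endpoint address and model name.
BASE_URL = "http://localhost:5273/v1"   # hypothetical local port
MODEL = "phi-3.5-mini"                  # hypothetical model alias

def build_chat_request(prompt: str) -> request.Request:
    """Build an OpenAI-style chat-completion request aimed at the local endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarize this document in one sentence.")
# To actually send it (requires the local service to be running):
#     with request.urlopen(req) as resp:
#         print(json.load(resp)["choices"][0]["message"]["content"])
```

Nothing leaves the machine: the same request shape that would normally go to a remote GPT endpoint is simply addressed to localhost, which is why existing OpenAI-client code typically needs only a base-URL (and dummy API key) change.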